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A  STUDY  OF  INDICATORS  FOR  IDENTIFYING  ZERO  VARIABLES 
IN  INTERIOR-POINT  METHODS* 

A.  S.  EL-BAKRY*,  R.  A.  TAPIA* ,  AND  Y.  ZHANG5 


Abstract.  In  this  study  we  are  concerned  with  constrained  optimization  problems  where  the  only 
inequality  constraints  are  nonnegativity  constraints  on  the  variables.  In  these  problems  the  ability  to 
identify  zero  variables  (binding  constraints)  early  on  in  an  iterative  method  is  of  considerable  value 
and  can  be  used  to  computational  advantage.  In  this  work  we  first  give  a  formal  presentation  of  the 
notion  of  indicators  for  identifying  zero  variables,  and  then  study  various  indicators  proposed  in  the 
literature  for  use  with  interior-point  methods  for  linear  programming.  We  present  both  theory  and 
experimentation  that  speaks  strongly  against  the  use  of  the  variables  as  indicators;  perhaps  the  most 
frequently  used  indicator  in  the  literature.  Our  study  implies  that  an  indicator  proposed  by  Tapia 
in  1980  is  particularly  effective  in  the  context  of  primal-dual  interior-point  methods.  We  also  study 
the  local  rate  of  convergence  for  several  indicators. 

Key  words.  Interior-point  methods,  Indicator  function,  Identifying  zero  variables 

AMS  subject  classifications.  65K,  49M,  90C 


1.  Introduction.  This  paper  describes  a  study  of  various  indicators  proposed  in 
the  literature  for  the  identification  of  zero  variables  in  linear  programming  problems. 
Our  particular  focus  will  be  on  indicators  that  can  be  used  in  conjunction  with  primal- 
dual  interior-point  methods.  We  consider  the  linear  programming  problem  in  the 
standard  form 

minimize  cT  x 

(1.1)  subject  to  Ax  =  b 

x  >  0, 

where  cgR"  ,  b  €  Rm,  A  G  Rmx”  (m  <  n)  and  A  has  full  rank  m.  The  first-order 
optimality  conditions  for  the  linear  program  (1.1)  are: 

/  Ax  —  b  \ 

(1.2)  F(x,y,  X)  =  AT X  +  y  —  c  =0 

\  XYe  / 

and 

(1.3)  (x,y)>  0 

where  X  =  diag(x),  Y  —  diag(y )  and  e  is  the  n-vector  of  all  ones.  A  point  ( x ,  y,  A)  is 
said  to  be  strictly  feasible  if  it  satisfies  Ax  =  b,  AT X  +  y  =  c  and  (x,  y)  >  0.  A  solution 
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pair  (x,y)  is  said  to  satisfy  strict  complementarity  if  in  addition  to  complementarity 
XY e  =  0,  it  satisfies  x  +  y  >  0. 

It  is  known  that  the  set  of  equations  and  inequalities  (1.2)-(1.3)  is  also  sufficient 
for  a  vector  x*  to  solve  problem  (1.1).  An  effective  method  for  solving  a  set  of 
nonlinear  equations  is  Newton’s  method.  It  is  well  known  that  Newton’s  method  not 
only  converges  but  also  converges  feist  if  it  starts  from  a  point  that  is  sufficiently  close 
to  a  solution  of  the  system.  On  the  other  hand  existing  theory  cannot  ensure  the 
convergence  of  Newton’s  method  if  it  is  started  far  away  from  a  solution.  Moreover 
it  is  not  clear  how  to  apply  Newton’s  method  to  a  system  of  nonlinear  equations 
and  inequalities.  Primal-dual  interior-point  methods,  that  have  the  basic  structure 
of  the  primal-dual  method  originally  proposed  by  Kojima,  Mizuno,  and  Yoshise  [21] 
based  on  earlier  work  of  Megiddo  [28],  can  be  viewed  as  perturbed  damped  Newton’s 
method  for  solving  (1.2).  The  right-hand  side  of  the  Newton  equation  is  modified 
(perturbed)  in  such  a  way  that  the  iterates  generated  by  the  algorithm  do  not  stray 
far  from  the  so-called  central  path.  For  a  thorough  study  of  the  central  path  in  interior- 
point  methods  in  linear  programming  see  Gonzaga  [13].  This  controlled  perturbation, 
which  is  a  consequence  of  y(xk ,  yk)e  in  Step  2  of  Algorithm  1  below,  ensures  the  global 
convergence  of  Newton’s  method  applied  to  (1.2).  In  the  damped  Newton’s  method 
the  steplength  may  be  chosen  less  than  one.  In  the  present  application  the  Newton 
step  is  damped  (see  Steps  3  and  4  of  Algorithm  1  below)  so  that  positive  iterates  are 
maintained.  This  can  be  accomplished  provided  that  the  initial  iterate  is  positive.  It 
is  well  known  that  the  damping  of  the  Newton  step  may  have  an  adverse  effect  on  the 
fast  local  convergence  of  Newton’s  method. 

The  algorithmic  framework  for  such  primal-dual  interior-point  methods  has  the 
following  form 

Algorithm  1  (Primal-Dual  Interior-Point  Method). 

Given  a  strictly  feasible  point  (x°,y°,  A0).  For  k  =  0, 1, . . do 

1.  Choose  <rk  £  (0, 1)  and  set  y(xk  ,yk)  =  ak^x  . 

2.  Solve  the  following  system  for  (Axk ,  Ayk ,  AXk): 

(1.4)  F'(xk,yk ,  \k)(Ax,  Ay ,  AX)  =  —F(xk,yk,Xk)  +  fi(xk,yk)e 

3.  Choose  a  step-length  ak  =  min(l,rid*:)  for  rk  £  (0, 1)  and 

_  _ ^1 _ 

a  min((A*)_1  Axk ,  (Y*)-1  Ayk) 

4.  Form  the  new  iterate 

(xk+\yk+1,Xk+1)  =  (xk,yk,Xk)  +  ak (Axk ,  Ayk ,  AXk). 

In  Step  2  e  =  (0, ...  ,0, 1, ... ,  1)T,  with  n  +  m  zero  components. 

Note  that  the  choice  of  step-length  ak  guarantees  (xk+1,  yk+1)  >  0.  Moreover,  it 
can  be  easily  verified  that 

(1.5)  (xk+1)Tyk+1  =  (1  -  a*(l  -  ak))(xk)T yk  . 

The  two  algorithmic  parameter  sequences  {crk}  and  { rk }  play  an  essential  role  in 
the  global  and  local  theory  as  well  as  in  the  implementation  of  interior-point,  meth¬ 
ods.  Originally,  several  choices  for  {< Tk }  were  proposed  to  ensure  global  (polynomial) 
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convergence.  For  example  in  Monteiro  and  Adler’s  path-following  primal-dual  algo¬ 
rithms  [35], 


a 


k 


where  k  E  (0,  y/n)  is  chosen  to  satisfy  a  certain  restriction.  In  Todd  and  Ye’s  primal- 
dual  potential  reduction  algorithm  [42] 


where  v  is  a  positive  constant.  In  Mizuno,  Todd  and  Ye’s  algorithm  [34]  <rk  is  chosen 
to  be  0  and  1  alternatively.  However  their  algorithm  does  not  fit  in  the  framework  of 
the  generic  algorithm  given  above,  so  we  do  not  discuss  their  algorithm  in  this  work. 
All  of  these  choices  were  proposed  to  guarantee  a  polynomial  bound  on  the  complexity 
of  these  algorithms.  In  spite  of  their  good  theoretical  properties,  these  choices  of  the 
parameters  exhibit  poor  numerical  performance,  see,  for  example,  El-Bakry,  Tapia 
and  Zhang  [9]. 

Impressive  numerical  results  were  reported  by  Choi,  Monma  and  Shanno  [6], 
McShane,  Monma  and  Shanno  [27]  and  Lust.ig,  Marsten  and  Shanno  [24]  for  primal- 
dual  interior-point  methods  that  use  very  small  values  for  <jk  and  take  long  steps  (i.e. 
large  values  for  rk).  The  best  results  reported  so  far  come  from  an  implementation 
by  Lustig,  Marsten  and  Shanno  [25]  of  a  primal-dual  predictor-corrector  algorithm. 
In  this  implementation  Lustig,  Marsten  and  Shanno  [25]  choose 

rk  =  0.99995 


and 


(1.6) 


f  if  n<  5000 

i  -V,  if  n  >  5000 

k  riy/n  ) 


when  the  duality  gap  is  relatively  small.  Motivated  by  these  impressive  numerical 
results  Zhang,  Tapia  and  Dennis  [52]  studied  the  rate  of  convergence  of  primal- 
dual  interior-point  methods.  They  were  able  to  establish  superlinear  and  cjuadratic 
convergence  to  zero  of  the  duality  gap  sequence  {xk  yk  }  for  iterates  generated  by 
Algorithm  1  for  appropriate  choices  of  the  sequences  {crk}  and  {rfc}.  Their  work 
sparked  a  new  research  direction  in  interior-point,  methods,  namely  local  convergence 
rate  analysis  of  interior-point  methods.  The  essential  conditions  in  the  fast  local 
convergence  theory  of  interior-point  methods  are  that  the  perturbation  to  the  first 
order  necessary  conditions  be  phased  out  fast  (<Ti  — ►  0)  and  that  the  iterates  approach 
the  boundary  of  the  positive  orthant  (t*  — +  1).  The  question  of  whether  primal-dual 
interior-point  methods  with  fast  local  convergence  can  also  have  a  polynomial  bound 
on  the  number  of  iterations  was  answered  in  the  affirmative  by  Zhang  and  Tapia  [51]. 

One  of  the  main  tasks  in  many  methods  for  solving  inequality  constrained  op¬ 
timization  problems  is  the  identification  of  constraints  that  are  active  (or  binding) 
at  a  solution  of  the  problem.  Such  identification  removes  the  combinatorial  aspect, 
of  the  problem  brought  on  by  the  presence  of  inequality  constraints  and  reduces  the 
problem  to  an  equality  constrained  problem.  For  linear  programming  problems  in 
the  standard  form  (1.1)  the  inequality  constraints  are  the  nonnegativity  constraints 
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on  the  variables.  So  binding  constraints  correspond  to  variables  that  are  zero  at  a 
solution  of  problem  (1.1).  We  use  the  notation 

Z(x)  —  {i  :  Xi  =  0,  1  <  i  <  n} 

to  denote  the  set  of  indices  of  zero  variables  at  a  feasible  point  x  of  problem  (1.1). 
Notice  that  Z(x*)  may  be  different  for  different  solutions  x*  of  the  same  linear  pro¬ 
gramming  problem.  However,  from  Theorem  2.2  of  Section  2  we  know  that  Z(x*) 
is  invariant  with  respect  to  solutions  x*  in  the  relative  interior  of  the  solution  set  of 
(1.1).  Hence  in  this  case  we  may  denote  the  set  of  indices  of  zero  variables  at  any 
solution  x*  in  the  relative  interior  of  the  solution  set  by  Z*  and  no  confusion  will 
arise. 

This  paper  is  organized  as  follows.  The  structure  of  the  solution  set  of  the  linear 
programming  problem  is  studied  in  Section  2.  In  Section  3,  we  define  the  indicator 
function  and  list  some  properties  that  a  good  indicator  should  possess.  In  Section  4  we 
study  one  of  the  earliest  indicators  proposed,  and  probably  the  most  frequently  used 
indicator,  for  identifying  2(x*),  namely  the  variables  used  as  indicator.  We  demon¬ 
strate  both  theoretically  and  numerically  that  this  indicator  has  serious  disadvantages. 
The  primal-dual  indicator  which  has  been  used  recently  by  several  researchers  is  inves¬ 
tigated  in  the  context  of  primal-dual  interior-point  methods  in  Section  5.  In  Section 
6  we  study  the  Tapia  indicators  for  the  linear  programming  problem  and  discuss 
their  behavior  in  several  interior-point  methods.  Section  7  is  devoted  t.o  establish¬ 
ing  the  rate  of  convergence  of  several  indicators.  Numerical  experiments  are  given 
in  Section  8.  These  numerical  experiments  include  the  study  of  the  usefulness  of  the 
variables  as  indicators,  as  well  as  the  usefulness  of  the  primal-dual  indicator  in  primal- 
dual  interior-point  methods.  They  also  include  comparisons  between  the  variables  as 
indicators,  the  primal-dual  indicator,  and  the  Tapia  indicator.  Concluding  remarks 
are  given  in  Section  9.  Finally  we  list,  in  Appendix  A,  several  indicators  that  have 
been  proposed  for  interior-point  methods. 

2.  Structure  of  the  Solution  Set.  The  structure  of  the  solution  set  of  the 
linear  programming  problem  will  play  an  important  role  in  explaining  the  behavior 
of  certain  indicators.  For  this  reason  we  begin  this  study  with  an  investigation  of 
the  structure  of  the  solution  set  and  in  particular  the  distribution  of  strict  comple¬ 
mentary  solutions  within  the  solution  set.  We  establish  our  main  result  for  a  larger 
class  of  problems,  namely  for  monotone  complementarity  problems,  since  the  proof  is 
essentially  the  same  as  that  for  linear  programming. 

First,  we  need  some  preliminary  concepts.  Following  McLinden  [26]  and  Giiler 
and  Ye  [15],  by  the  support  a(v)  for  v  £  R"  we  mean  the  set  of  indices  of  positive 
components  of  v,  i.e. 


< t(v )  =  {i  :  Vi  >  0}. 

In  particular,  the  support  of  a  vector  with  no  positive  components  is  the  empty  set. 
Consider  the  partial  order  ■<  on  Rn  defined  by 

v  <u  if  ff(v)  C  <t(u). 

Two  vectors  u  and  v  are  said  to  be  equivalent ,  denoted  by  u  ~  v,  if  u  ■<  v  and  v  <  u. 
An  element  v  £  U  C  Rn  is  said  to  be  a  -<-  maximal  element  of  U  if 

u  ~ 


ueU  and  v  -<  u 


v. 
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It  is  obvious  that  any  subset  U  of  Rn  has  at  least  one  ^-maximal  element.  Giiler 
and  Ye  [15]  made  the  following  straightforward  but  key  observation  in  revealing  the 
structure  of  the  solution  set  T . 

Observation  2.1.  If  U  C  Rn  is  convex,  then  all  maximal  elements  of  U  are 
equivalent. 

In  particular,  the  above  observation  implies  that  if  U  is  a  convex  subset  of  the 
positive  orthant  of  R" ,  then  the  zero  structure  of  the  maximal  elements  of  U  is 
invariant. 

A  multivalued  mapping  T  :  Rn  — ►  2Rn  is  said  to  be  a  monotone  operator  if 

y£T(x)  and  y  £  T(x)  =>  (y  -  y)T (x  -  x)  >  0. 

A  monotone  operator  T  is  said  to  be  maximally  monotone  if  the  graph  G(T )  = 
{(ar,  j/)  £  Rn  x  Rn  :  y  £  T(a;)}  is  not  properly  contained  in  the  graph  of  any  other 
monotone  operator.  The  complementarity  problem  associated  with  a  maximally  mono¬ 
tone  operator  T  is 

(2.1)  Find  (x,y)  £  G(T)  such  that  xTy  =  0  and  (x,y)>  0. 

The  set  of  solutions  of  problem  (2.1)  will  be  denoted  by  T.  Consider  the  set 

M  —  {(xm,ym)  £  T  :  ( xm ,ym )  is  maximal }. 

The  following  proposition  is  a  consequence  of  Lemma  2.3  in  Giiler  [14] 

Proposition  2.1  (Guler).  The  solution  set  T  for  the  monotone  complemen¬ 
tarity  problem  (2.1)  is  convex. 

The  convexity  of  T  implies,  by  Observation  2.1,  that  the  zero  structure  of  maximal 
solutions  (or  solutions  that  satisfy  strict  complementarity  if  they  exist)  is  invariant. 
The  invariance  of  the  zero  structure  of  solutions  that  satisfy  strict  complementarity 
(which  are  maximal  in  complementarity  problems)  was  proved  for  a  special  class  of 
linear  programming  problems  by  Charnes,  Cooper,  and  Thrall  [4], 

We  denote  the  relative  interior  of  a  set  U  by  ri  U.  See  Rockafellar  [37]  for  a 
definition  of  relative  interior.  Now  we  state  our  main  result  concerning  the  structure 
of  the  solution  set  T  of  problem  (2.1). 

THEOREM  2.2.  Assume  that  the  solution  set  T  of  the  monotone  complementar¬ 
ity  problem  (2.1)  is  nonempty.  Then  the  relative  interior  of  T  is  exactly  the  set  of 
maximal  elements  of  T . 

Proof.  By  Theorem  6.2  of  Rockafellar  [37],  ri  (T)  is  nonempty  and  convex.  Let 
cl  (T)  and  drT  =  (cl  ( T))\(ri  (T))  denote  the  closure  and  the  relative  boundary  of 
T,  respectively.  There  exists  at  least  one  maximal  element  (xm ,  ym )  of  T  that  lies  in 
ri  (T).  Choose  an  arbitrary  point  ( xb,yb )  £  drT .  Consider  the  convex  combination 

(x*,y*)  =  <t>(xb,yb)  +  (\-4>)(xm,ym)  0<*<  1. 

It  follows  directly  from  Theorem  6.1  of  Rockafellar  [37]  that  (xA,y^)  £  ri  (T).  We 
have 


a(x\  i/)  =  a(x\  yb)  U  a(xm ,  ym)  =  a(xm ,  ym), 


which  shows  that 


(**,/)  eA4,  *G[0,1). 
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Now  consider  any  point  (x*,y‘)  G  ri  (T).  Convexity  of  T  implies  that  ( x‘,y* )  lies  on 
some  line  segment  connecting  (xm,  ym)  and  some  point  (xb ,  yb)  G  drT .  But  we  have 
seen  that  all  these  line  segments  lie  in  Ad.  Thus  (x* ,  y')  G  M.  This  proves  that 

(2.2)  ri  T  CM. 

Now  we  will  show  that  any  point  in  the  relative  boundary  of  T  cannot  lie  in  Ad. 
Consider  ( xb,yb )  G  drT .  Suppose  (xb,yb)  G  Ad.  Let  ( x',y ’)  G  ri  (T).  From  (2.2), 
(x\y')  G  Ad  and  hence  (x*,  y*)  ~  (xb,yb).  Without  loss  of  generality,  assume  that 

<K*‘)  =  {1,  •  •  ■>»•}  =  ‘K*6) 

and 

<x(y‘)  =  2n}  =  <r(yi), 

where  r  <  s.  Now  any  point  on  the  relative  boundary  of  T  is  the  intersection  of  the 
graph  G(T )  with  at  least  one  of  the  hyperplanes 

X{  =  0  i  =  1, . . . ,  r 

or 

yi  =  0  *  =  2»r. 

This  contradicts  the  maximality  of  (xb ,yb)  and  completes  the  proof.  □ 

It  is  known  that  the  primal-dual  formulation  of  the  linear  programming  problem 
can  be  stated  as  a  monotone  complementarity  problem  with  graph 

(2.3)  Glp{T)  =  {(x,  y)  :  Ax  =  b,  AT X  +  y  =  c  for  some  X  G  R  m } ■ 

For  more  details  see  Giiler  and  Ye  [15].  Consider  the  set  S  consisting  of  the  solutions 
of  the  linear  programming  problem  that  satisfy  strict  complementarity.  Clearly  any 
element  of  S  is  maximal  in  T.  As  an  immediate  consequence  of  Theorem  2.2,  we  can 
determine  the  distribution  of  S  within  T  for  linear  programming  problems. 

Corollary  2.3.  Assume  that  there  exists  at  least  one  point  ( x°,y° )  G  Glp{T) 
such  that  ( x°,y° )  >  0.  Then  the  relative  interior  of  T  is  exactly  the  set  of  solutions 
of  the  linear  programming  problem  that  satisfy  strict  complementarity. 

Proof.  The  well-known  Goldman-Tucker  theorem  states  that  5^0  for  linear 
programming  problems.  The  proof  now  follows  directly  from  Theorem  2.2.  DThis 
result  concerning  the  structure  of  the  solution  set  of  the  linear  programming  problem 
can  also  be  derived  from  a  study  of  the  structure  of  the  solution  set  of  the  linear 
complementarity  problem  carried  out  by  Jansen  and  Tijs  [18].  In  Sections  3  and  4,  we 
discuss  the  effect  that  the  structure  of  the  solution  set  has  on  the  behavior  of  certain 
indicators. 

A  straightforward  implication  of  Corollary  2.3  and  Lemma  2  of  Giiler  and  Ye  [15] 
is  that  if  an  interior-point  method  generates  iterates  ( xk,yk )  that  satisfy 

min(X*Yfce) 

(xk)T yk  ~ 

where  7  is  positive  number,  then  {(x^y*)}  cannot  have  limit  points  that  lie  on 
the  relative  boundary  of  the  solution  set.  In  particular  this  means  that  interior-point 
methods  that  satisfy  such  a  bound  cannot  generate  iterates  which  converge  to  a  vertex 
solution  of  the  linear  programming  problem,  unless  of  course  the  problem  has  a  unique 
solution. 
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3.  The  Indicator  Function  :  Definition  and  History.  Following  Tapia  [38], 
we  use  the  term  indicator  to  denote  a  function  that  identifies  constraints  that  are 
active  at  a  solution  of  a  constrained  optimization  problem.  Although  indicators  have 
been  used  extensively  since  at  least  1984  in  the  context  of  linear  programming,  a 
unified  framework  that  includes  a  definition,  desired  properties,  and  general  guidelines 
for  their  use  has  not  been  provided.  It  is  a  main  objective  of  this  work  to  provide 
such  a  framework. 

Throughout  this  paper  we  will  consider  iterative  procedures  of  the  generic  form 

zk  +  l  _  zk  +  ak^zk 


The  majority  of  our  discussion  centers  around  primal-dual  interior-point  methods  and 
in  this  case  our  iterates  have  the  form 


zk  =  (xk,yk,Xk). 

In  a  primal  method,  e.g.,  the  Karmarkar  algorithm,  we  have  zk  =  xk .  It  is  natural 
to  define  the  indicator  as  a  function  of  zk  and  A zk  and  perhaps  an  auxiliary  variable 
which  may  represent  the  step-length  ak  or  other  quantities.  However,  in  the  interest 
of  conciseness  we  will  consider  the  auxiliary  variable  implicitly  in  the  definition  and 
not  formally  state  its  dependence. 

Let  (zk,  Azk)  be  generated  by  an  iterative  procedure  of  the  generic  form  discussed 
above.  By  an  indicator  function  /  we  mean  any  function  which  assigns  to  (zk ,  Azk) 
an  n-vector  of  extended  reals  I(zk ,  Azk)  and  satisfies  the  property  that  if  zk  — >  z* , 
then  for  i  =  1 , . . . ,  n 


(3.1) 


lirri  Ii(zk,Azk) 


<j>i,  if  i  6  Z(x*) 
9i,  if  i<jLZ(x*) 


for  some  0,-  and  </>,-  satisfying  min,-  0,  >  max,-  4>i  ■ 

Throughout  this  paper  we  use  the  terms  indicator  and  indicator  function  inter¬ 
changeably.  Whenever  it  is  appropriate  and  no  confusion  will  arise,  we  write  I(xk), 
I(yk),  or  I(xk,yk)  instead  of  I(zk ,  Azk).  We  also  use  the  term  indicator  to  denote 
the  n-vector  I(zk,Azk)  or  any  of  its  components. 

It  is  desirable  that  an  indicator  function  I  possess  the  following  ideal  properties: 

1.  the  sharp  separation  property 


min  9i  »  max 

igZ(x’)  i€2(i 7") 

2.  the  uniform  separation  property 

9(  =  6  ;  i  (£.  Z(xm)  and  (j>i  =  <j>  ;  i  €  Z(x *) 

for  some  constants  6  and  tf>.  In  this  case,  it  is  also  desirable  that  6  and  <f>  be  independent 
of  both  the  solution  and  the  problem; 

3.  the  indicator  is  inexpensive  to  compute; 

4.  the  indicator  sequence  {I(zk ,  Azk)}  converges  to  its  limit  faster  than  {zk} 
converges  to  z*\ 

5.  the  indicator  gives  reliable  information  early  on  in  the  iterative  process; 

6.  the  indicator  is  scale  independent,  i.e.  it  does  not  change  if  the  variables  are 
scaled  by  any  positive  diagonal  matrix. 
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Clearly,  an  indicator  may  be  effective  and  not  possess  all  these  ideal  properties.  How¬ 
ever,  it  is  our  considered  opinion  that  both  the  sharp  and  the  uniform  separation 
properties  are  extremely  important.  Several  numerical  experiments  demonstrating 
the  importance  of  these  two  properties  are  given  in  Section  8. 

If  some  members  of  Z(x*),  for  some  solution  x* ,  are  identified  early  on  in  an 
iterative  procedure,  then  this  information  can  be  used  to 

1 .  reduce  the  dimension  of  the  problem  by  dropping  the  columns  of  the  matrix 
A  corresponding  to  the  zero  variables.  This  reduction  may  result  in  significant  savings 
in  computational  work. 

2.  help  recover  an  optimal  basis  for  the  linear  program  using  techniques  along 
the  lines  of  Megiddo  [29] . 

3.  help  obtain  very  accurate  solutions  using  projection  methods  as  proposed  by 
Ye  [49] .  It  can  also  be  used  to  help  obtain  a  vertex  solution  using  random  perturbation 
methods  as  proposed  by  Mehrotra  [31]. 

4.  help  in  “column  generation”  methods  where  the  algorithm  starts  with  a  small 
set  of  constraints  and  adds  new  potentially  active  constraints  at  each  iteration.  Several 
methods  are  proposed  in  the  literature  for  that  purpose,  see,  for  example  Mitchell  [32], 
Mitchell  and  Todd  [33],  Goffin  and  Vial  [12],  Dantzig  and  Ye  [7],  Vaidya  [45],  Atkinson 
and  Vaidya  [2],  and  den  Hertog,  Roos,  and  Terlaky  [17].  For  a  theoretical  analysis 
of  a  column  generation  and  deletion  long-step  logarithmic  barrier  algorithm,  see  den 
Hertog,  Roos,  and  Terlaky  [16]. 

The  task  of  predicting  Z(x*)  has  been  considered  in  recent  years  by  many  re¬ 
searchers  and  various  indicators  have  been  proposed  for  this  purpose.  Gill  et  al  [11], 
Karmarkar  and  Ramakrishnan  [19],  McShane,  Monma  and  Shanno  [27],  Tone  [44], 
Lustig,  Marsten,  and  Shanno  [24],  Dantzig  and  Ye  [7],  and  Boggs,  Domich,  Donaldson 
and  Witzgall  [3],  among  others,  proposed  the  use  of  variables,  either  primal  or  dual, 
to  predict  members  of  Z(x*).  Tapia  [38]  introduced  two  indicators  in  the  context, 
of  identifying  active  constraints  in  nonlinear  constrained  optimization  problems.  Ko- 
jima  [20]  proposed  an  indicator  for  use  in  Karmarkar-type  algorithms.  Ye  [47]  and 
Todd  [41]  introduced  two  indicators  for  Karmarkar-type  and  primal-dual  algorithms. 
Choi  and  Goldfarb  [5]  proposed  two  indicators  similar  to  the  Todd-Ye  indicators  with 
the  advantage  that  their  indicators  can  be  used  in  algorithms  that  are  not  necessarily 
interior-point  methods.  Tapia  and  Zhang  [39]  proposed  an  indicator  that  can  be  used 
in  primal,  dual,  or  primal-dual  interior-point  methods.  Kovacevic-Vujcic  [22]  intro¬ 
duced  an  indicator  that  is  superlinearly  faster  than  the  variables  in  Karmarkar-type 
methods.  The  ratio  of  primal  variables  and  dual  slacks  was  used  as  an  indicator  by 
several  researchers  including  Gay  [10],  Ye  [49],  and  Lustig  [23].  Mehrotra  [30]  used 
an  indicator  based  on  the  relative  change  in  the  dual  slack  variables.  Resende  and 
Veiga  [36]  used  the  reciprocal  of  the  dual  slack  variables  as  indicators.  Many  of  these 
indicators  have  been  cataloged  in  Appendix  A  along  with  some  critical  comments. 

4.  The  Variables  as  Indicators.  In  both  linear  and  nonlinear  programming 
the  use  of  the  variables  as  indicators  is  a  part  of  the  optimization  folklore.  In  linear 
programming,  Gill  et  al.  [11]  set  primal  variables  with  very  small  absolute  values 
to  zero.  Karmarkar  and  Ramakrishnan  [19]  suggested  using  the  dual-slack  variables 
as  indicators.  McShane,  Monma  and  Shanno  [27]  suggested  setting  those  variables 
with  small  absolute  value  and  large  dual  slack  to  zero.  Boggs,  Domich,  Donaldson 
and  Witzgall  [3]  used  the  primal  slacks  with  large  values  to  remove  constraints  from 
the  problem  using  an  algorithm  based  on  the  method  of  centers.  While  this  indica¬ 
tor  is  readily  accessible,  it  has  serious  disadvantages.  It  does  not  satisfy  either  the 
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sharp  separation  or  the  uniform  separation  property  and  is  scale  dependent.  Another 
disadvantage  is  that  in  general  it  does  not  give  information  soon  enough  to  save  com¬ 
putational  work  or  improve  the  performance  of  the  algorithm.  Some  researchers  were 
aware  of  the  deficiencies  of  this  indicator  and  therefore  tried  to  use  it  in  a  conser¬ 
vative  manner.  Unfortunately,  the  undesirable  aspects  of  this  indicator,  namely  the 
lack  of  both  the  sharp  and  uniform  separation  properties,  are  inherent  in  the  conver¬ 
gence  particulars  of  interior-point  methods  and  in  the  structure  of  the  solution  set  of 
the  problem.  In  the  following,  we  demonstrate  the  detrimental  effect  that  these  two 
factors  can  have  on  the  behavior  of  the  variables  when  used  as  indicators. 

1.  The  effect  of  convergence  particulars  of  interior-point  methods: 

A  main  difficulty  in  using  an  indicator  function  arises  when  a  threshold  sequence 
{sLo}  is  to  be  used  in  the  identification  test, 

(4-1)  /,•(**)  <  bkzero  =>■  x*  =  0. 

Since  for  this  indicator  Ii(xk)  —  xk ,  the  sequence  {6kero}  must  satisfy 

max  x,-  <  6*r„  <  min  x; 
i€2(r*)  igZ(x') 

where  b*eT0  =  lim)fc_00  Skero  (assuming  that  the  limit  exists)  and  x  is  the  approximate 
solution  given  by  the  algorithm.  In  order  to  identify  zero  variables  early  on,  the 
threshold  sequence  should  satisfy 

(4.2)  max  xj  <  6kro  <  min  xj, 

for  k  >  K,  where  K  is  a  relatively  small  positive  integer.  Since  max,.?z(I.)  ii  and 
min,^2(j,.)  Xj  are  not  known  a  priori,  it  is  very  difficult  to  construct  a  sequence  {bkero} 
that  satisfies  these  conditions.  In  our  numerical  studies  we  often  found  that  there  was 
a  large  gap  between  components  of  the  final  approximate  solution  generated  by  the 
interior-point  method  and  the  components  of  x*.  In  fact,  we  observed  numerically  the 
annoying  phenomenon  that  for  a  final  approximate  solution  generated  by  an  interior- 
point  method,  we  may  have 


max  Xj  >  min  Xj. 

*€  2  (a?*)  *"£2  (a?*) 

This  shows  that,  in  practice,  a  threshold  sequence  {bkero}  that  satisfies  (4.2)  may  not 
exist. 

2.  The  effect  of  the  structure  of  the  solution  set: 

An  implication  of  Corollary  2.3  is  that  if  (x6,  yb)  is  a  point  on  the  relative  boundary  of 
the  solution  set  T,  then  there  exits  at  least  one  component,  say  xj,  such  that  xj  —  0, 
while  xj  >  0  for  all  xs  £  ri(T).  The  convexity  of  T  implies  that  xj  is  arbitrarily  small 
for  solutions  xs  arbitrarily  close  to  the  boundary  while  the  corresponding  dual  slacks 
are  zero.  So  the  use  of  variables  as  indicators  may  be  misleading  even  in  the  presence 
of  both  primal  and  dual  information.  Finally,  the  geometry  of  the  problem  may  be 
such  that  T  is  so  thin  that  some  positive  component  x*  has  a  very  small  value  for  all 
solutions  in  the  relative  interior  of  T. 

Numerical  experiments  with  the  variables  as  indicator  are  presented  in  Section  8. 
These  experiments  speak  strongly  against  the  use  of  variables  as  indicators. 
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5.  The  Primal-Dual  Indicator.  Consider  the  indicator  function 
(5.1)  I(zk,Azk,ak)  =  {Yk+1)~lXk+1e, 


where  zk+1  =  zk  +  akAzk,  Y  =  diag(y)  and  X  =  diag(x).  We  will  call  this  indicator 
the  primal-dual  indicator  since  it  uses  both  primal  and  dual  information.  This  indica¬ 
tor  was  used  recently  by  several  researchers,  e.g.  Gay  [10],  Ye  [49]  and  Lustig  [23].  If 
strict  complementarity  holds,  then  the  primal-dual  indicator  satisfies  both  the  sharp 
and  the  uniform  separation  properties,  namely 


lim  I(xk ,  yk ,  ak 

k—*oo 


0 

oo 


if  ieZ(x*) 
if  i  £  Z(x*) 


The  primal-dual  indicator  does  not  require  nondegeneracy  or  feasible  iterates.  Unfor¬ 
tunately,  it  is  scale  dependent.  Another  disadvantage  is  that  the  identification  test 
Ii(zk ,  Azk ,  ak)  <  6zer0  =>■  x*  =  0  is  sensitive  to  the  choice  of  6zero.  Our  numeri¬ 
cal  experiments  show  that  the  primal-dual  indicator  gives  reliable  information  only  if 
the  iterates  are  very  close  to  a  solution.  In  fact  in  many  cases  this  indicator  could 
not  identify  all  zero  variables  even  at  the  final  approximate  solution  generated  by 
the  interior-point  algorithm.  These  observations  motivated  us  to  study  in  detail  the 
structure  of  the  primal-dual  indicator  in  primal-dual  interior-point,  methods. 


5.1.  The  Primal-Dual  Indicator  in  Interior-Point  Methods.  In  the  frame¬ 
work  of  primal-dual  interior-point  methods,  the  behavior  of  the  primal-dual  indicator 
can  be  explained  using  the  following  proposition. 

Proposition  5.1.  Assume  that  the  sequence  of  iterates  {(xk ,yk ,  \k)}  has  been 
generated  by  Algorithm  1.  Then  for  i  =  1, . . . ,  n 
~k+l 

(5.2)  Hhr  =  (2- 


y , 


Vi 


,1+1 


— T  + 

yf 


<rk  (xk)Tyk 

n  yfy-+1 


Proof.  The  perturbation  of  the  linearized  complementary  slackness  equation  gives 


Thus 


X(y  +  a  Ay)  +  Y(x  4-  c*Ax)  =  (2  —  <x)XY e  +  ay{x,  y)e. 


XkYk+1e  +  YkXk+1e  =  (2  -  ak)XkYke  +  aknke. 

Multiplying  both  sides  by  (Yk)-1(Yt+1)-1  completes  the  proof.  □ 

Examining  equation  (5.2),  we  believe  that  the  undesirable  behavior  of  the  primal- 
dual  indicator  is  due  to  one  of  the  following  situations: 

(i)  If  xk  —*  x*  >  0  for  some  i,  then  y*  =0.  In  this  case 


x- 

—r—*00  —  OO, 

yf 


which  is  essentially  an  undefined  quantity;  and  it  is  not  clear  that  the  primal-dual 
indicator  will  approach  infinity  fast  enough  to  be  of  effective  use. 

(ii)  If  xf  — ►  x*  =  0  for  some  i,  then  y *  >  0.  In  this  case  each  of  the  three 
terms  on  the  right-hand  side  of  (5.12)  tends  to  zero.  However,  if  y*  >  0,  but  has  a 
small  value  (e.g.  10-4)  then  the  denominator  of  the  third  term  will  be  much  smaller 
and  could  cause  the  primal-dual  indicator  to  have  large  values.  This  problem  can  be 
partially  corrected  if  we  let  ak  — ►  0  fast.  Unfortunately,  if  y*  >  0  but  has  a  very  small 
value,  which  may  occur  as  argued  in  Section  2,  then  all  three  terms  in  (5.2)  become 
small  only  when  the  iterates  are  extremely  close  to  a  solution. 
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6.  The  Tapia  Indicator.  In  order  to  identify  active  and  inactive  constraints 
for  a  nonlinear  constrained  optimization  problem,  Tapia  [38]  suggested  using  the  quo¬ 
tient  of  successive  Lagrange  multipliers  and  the  quotient  of  successive  slack  variables 
as  indicators  in  the  context  of  an  iterative  procedure  that  enforces  linearized  comple¬ 
mentarity. 

For  linear  programming  problems,  the  Tapia  indicators  are 

(6.1)  Ip{xk)  =  (Xk)-1Xk+1e, 
and 

(6.2)  Id(yk)  =  e-(y*)-1y‘+1c, 

where  xfc+1  —  xk  +  Ax ,  yt+1  —  yk  +  Ay  and  e  =  (1, . . . ,  1)T. 

El-Bakry  [8]  studied  the  behavior  of  the  Tapia  indicators  in  primal-dual  interior- 
point  methods.  He  also  used  both  indicators  to  identify  and  remove  zero  variables 
in  primal-dual  interior-point  methods  using  a  successive  projection  method,  where 
the  iterates  are  successively  projected  onto  the  hyperplanes  of  active  constraints  once 
these  hyperplanes  are  identified.  Some  of  these  results  are  presented  in  Section  8. 
Mehrotra  [31]  used  the  Tapia  primal  indicator  Ip  in  his  perturbation  method  to  iden¬ 
tify  a  vertex  solution  using  interior-point  methods.  The  properties  and  the  behavior 
of  these  indicators  in  various  interior-point  methods  are  discussed  in  detail  in  Sec¬ 
tion  6.1.  For  the  sake  of  completeness  we  present  the  following  proposition  which 
establishes  the  sharp  and  uniform  separation  properties  for  the  Tapia  indicators.  It  is 
essentially  a  specialization  to  problem  (1.1)  of  a  general  result  proved  by  Tapia  [38], 

Proposition  6.1  (Tapia).  Assume  that  the  sequence  of  iterates  {(xk ,yk ,  Xk)}, 
generated  by  an  iterative  procedure,  converges  to  a  strict  complementary  solution 
{(**,  j/*,  A*)}  of  the  first-order  necessary  conditions  for  problem  (1.1).  Assume  further 
that  linearized  complementarity 

XkAyk  +YkAxk  =  -XkYke 


is  satisfied.  Then  for  i  =  1, . . . ,  n 


(6.3) 

if  —  {  1  0 

Fi  €  Z(x*) 
r  i  Z(x*) 

and 

(6.4) 

t L”  -•*>-{  1 

if  i  €  Z(x*) 
ifitZ(x') 

where  xk+1 

=  xk  +  A Xi  and  yk+1  =  yf  +  Ay,-. 

It  is  obvious  from  the  above  proposition  that  Axk/xk  — *•  0  for  i  0  Z(x*),  and 
Axk /xk  — ►  —  1  for  i  £  2(x*).  Hence,  the  relative  change  Axk/xk,  which  is  merely 
a  restatement  of  the  Tapia  indicator,  can  also  serve  as  an  indicator.  We  emphasize 
that  in  general  the  relative  change  is  not  a  good  indicator,  a  fact  well-known  from  the 
elementary  theory  of  sequences.  In  the  context  of  linear  programming  problems,  it  is 
the  linearized  complementarity  and  the  additional  condition  of  strict  complementarity 
that  make  the  Tapia  indicators  (or  equivalently  the  relative  change)  effective.  How¬ 
ever,  in  the  primal-dual  interior-point  methods,  the  iterates  satisfy  a  perturbation  of 
the  linearized  complementarity  equation,  namely 


X Ay  +  Y Ax  =  —  XY e  +  //(x,  y)e. 
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The  question  as  to  whether  the  Tapia  indicators  can  retain  their  useful  properties  in 
that  framework  is  addressed  below. 

6.1.  The  Tapia  Indicators  in  Interior-Point  Methods.  It  was  observed,  in 
our  numerical  experiments,  that  the  Tapia  indicator  is  more  effective  than  the  primal- 
dual  indicator  in  identifying  zero  variables  in  most  test  problems,  see  Section  8.  This 
led  us  to  investigate  the  structure  of  the  Tapia  indicators  in  the  framework  of  primal- 
dual  interior-point  methods  for  linear  programming.  As  we  shall  soon  see  the  fit  is 
surprisingly  good. 

Proposition  6.2.  Consider  a  sequence  of  iterates  {(xk,yk,\k)}  generated  by 
Algorithm  1.  Assume  that 

1.  ( xk)Tyk  —  0. 

2.  e ^  >  7  >  0  for  all  k  and  some  y. 

3.  The  algorithmic  parameters  ak  and  rk  have  been  chosen  so  that 

<rk  — *■  0  and  Tk  — ►  1. 


Then  for  i  =  1 , ,n 


and 


lim 

k—+oo 


(  o  if  i  e  z(x*) 
\  1  ifi<£Z(x*) 


if  i  e  z(x*) 
ifi$Z(x *) 


where  xt+1  =  xk  +  (5k  Ax  and  yk+l  =  yk  +  fik  A y  for  any  f3k  £  [ak ,  1]  with  ak  given 
in  Step  3  of  Algorithm  1. 

Proof.  Consider 


XAy  +  Y Ax  =  —XY e  -f  y(x,  y)e. 


It  is  clear  that 


X(y  +  (3 Ay)  +  7(x  +  pAx)  =  (2  -  f3)XY e  +  /3fi(x,  y)e. 


Hence 


(6.5)  (X*)-1x,;+1-i-(y*:)-1t/t+1  =(2-/3k)e  +  pka' 


*k(*k)Tyk 


{XkYk)~le. 


Very  recently  Tapia,  Zhang  and  Ye  [40]  demonstrated  that  (Axk ,  Ayk)  — ^  0;  hence 

(6.6) 


As ?  Auk 

0  for  i  ^  Z(x*)  and  — - *•  0  for  i  6  Z(x*). 


Vi 


It  follows  from  (6.6)  and  the  definition  of  ak  that  d1  -»  1.  Hence  ak  — ►  1  and  therefore 
/?*  — *•  1.  The  result  follows  now  from  (6.5)  and  Assumptions  2  and  3.  D 

Remarks: 

1.  The  consistency  of  the  three  assumptions  of  Proposition  6.2  is  proved  for  a 
large  class  of  primal-dual  interior-point,  algorithms  in  Zhang  and  Tapia  [50]. 
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2.  The  fact  that  — »•  1 ,  under  the  assumptions  of  Proposition  6.2,  motivated 
us  to  use  /?fc  =  1  in  the  calculation  of  the  Tapia  indicators.  This  proved  to  give 
superior  results  over  /3k  =  ak  in  our  numerical  experiments. 

3.  According  to  the  theory  of  Tapia,  Zhang  and  Ye  [40],  the  assumptions  of 
Proposition  6.2  are  not  sufficient  to  ensure  that  the  iteration  sequence  generated  by 
Algorithm  1  converges.  To  ensure  the  convergence  of  the  iteration  sequence,  according 
to  their  theory,  Assumption  3  in  Proposition  6.2  must  be  strengthened  so  that  crk  — *  0 
at  least  R-linearly,  see  Tapia,  Zhang  and  Ye  [40]. 

4.  It  is  satisfying  to  us  that  the  conditions  which  guarantee  the  usefulness  of 
the  Tapia  indicators,  i.e.  conditions  1-3  in  Proposition  6.2,  are  exactly  the  conditions 
which  guarantee  fast  local  convergence,  i.e.  superlinear  convergence  of  the  duality 
gap  sequence  to  zero  (see  Zhang,  Tapia  and  Dennis  [52]  and  Zhang  and  Tapia  [50]). 
It  is  equally  satisfying  that  we  obtain  this  pleasant  behavior  of  the  Tapia  indicators 
without  the  assumption  that  the  iteration  sequence  converges,  as  Tapia  [38]  assumed 
in  Proposition  6.1. 

5.  It  is  clear,  from  the  proof  of  Proposition  6.2,  that  the  Tapia  indicators  con¬ 
verge  under  the  conditions  that  (1)  linearized  complementarity  is  approached  suffi¬ 
ciently  fast,  (2)  all  limit  points  of  the  iteration  sequence  satisfy  strict  complementarity 
and  have  the  same  zero-nonzero  structure,  and  (3)  the  sequence  {(Axk ,  Ayk)}  con¬ 
verges  to  zero.  It  is  not  hard  to  see  that  these  assumptions  are  sufficient,  for  the 
convergence  of  the  Tapia  indicators  to  their  0-1  limits  (defined  by  (6.3)  and  (6.4)) 
even  for  nonlinear  programming  problems. 

6.  We  emphasize  that  if  the  condition  ak  — ►  0  is  replaced  by  the  conditions 
the  iteration  sequence  converges  to  a  solution  in  ri(T)  and  that  <rk  — *  a  >  0,  then 
the  0-1  separation  property  (6.3)  is  not  retained.  In  other  words  the  convergence 
of  the  iteration  sequence  in  the  primal-dual  interior-point,  methods  is  not  enough  to 
guarantee  that  the  Tapia  indicators  have  the  0-1  separation  property.  The  assumption 
<rk  — *  0  is  crucial  in  this  context. 

Now  we  discuss  the  use  of  the  Tapia  indicator  in  several  interior-point,  methods. 
The  search  directions  in  many  interior-point,  methods  satisfy  the  following  system  of 
equations  (see  Ye  [48]) 


DxAx  +  DyAy  =  it— — — e  —  XYe, 
n 


AAx.  =  0, 


At  AX  +  Ay=0. 

Different  interior-point  methods  correspond  to  different  choices  of  Dx  and  Dy .  The 
first  equation  is  of  particular  interest,  in  our  analysis.  If  Dx,  Dy  >0  then  this  equation 
can  be  written 

(6.7)  D~l  Ax  +  D-'Ay  =  a^-D^D^e  -  D~k D~k XY e. 

n  *  y 


In  order  for  the  Tapia  indicators  to  retain  their  effectiveness  we  should  have: 
Dy1  Ax  +  D~1Ay  -»  X-1  Aa-  +  Y"1  Ay 
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and 


<t~~D~1D~1  +  D;lD~lXYe  -  -e. 

Examining  the  different  choices  for  Dx,  Dy  and  a  for  the  primal  (or  dual)  affine  scaling 
algorithms,  the  primal  (or  dual)  potential-reduction  algorithms  and  the  primal  (or 
dual)  path-following  algorithms,  it  is  easy  to  see  that  it  is  extremely  unlikely  that  the 
Tapia  indicators  will  retain  their  0-1  separation  property  in  these  contexts. 

In  conclusion,  we  believe  that  primal-dual  algorithms  where  crk  — ►  0  are  the 
natural  setting  for  the  use  of  the  Tapia  indicators.  Moreover,  in  this  case  the  Tapia 
indicators  and  the  primal-dual  interior-point  methods  are  an  excellent  match. 

Finally  we  list  the  properties  of  the  Tapia  indicators  that  make  them  effective  in 
practice  as  demonstrated  in  Section  8.  These  properties  are  as  follows. 

1.  They  are  inexpensive  to  compute. 

2.  They  satisfy  both  the  uniform  and  the  sharp  separation  properties.  The 
indicator  parameters  <j>  =  0  and  6—1  are  independent  of  the  problem. 

3.  From  our  numerical  experience,  the  Tapia  indicators  give  reliable  information 

early. 

4.  They  are  scale  independent  when  the  variables  are  scaled  by  any  positive 
diagonal  matrix. 

5.  They  do  not  require  feasibility  or  nondegeneracy. 

6.  They  do  not  require  convergence  of  the  iteration  sequence. 

7.  Rates  of  Convergence.  In  this  section  we  study  the  rate  of  convergence  of 
several  indicators.  Studying  the  behavior  of  indicators  is  more  subtle  than  studying 
the  behavior  of  other  sequences  generated  by  Algorithm  1  in  several  respects.  Among 
them  we  emphasize  the  following 

1.  The  behavior  of  the  components  of  the  indicator  function  is  more  important 
than  the  general  behavior  of  the  indicator  vector  itself. 

2.  The  ability  of  an  indicator  to  identify  elements  of  Z(x*)  far  from  a  solution 
is  more  important  than  fast  local  convergence  of  this  indicator.  In  other  words  it  is 
the  global  behavior  of  the  indicator  that  makes  it  effective,  not  its  fast  local  behavior. 
The  difficulty  in  studying  the  global  behavior  of  indicators  is  essentially  the  absence  of 
a  qualitative  measure  of  this  behavior.  In  the  absence  of  such  a  qualitative  measure, 
numerical  experimentation  may  be  used  to  study  the  global  behavior  of  the  indicator  in 
an  attempt  to  generate  concrete  mathematical  conjectures  that  explain  this  behavior. 

Now  we  proceed  in  our  study  of  the  convergence  rate  of  the  variables  as  indicator, 
the  primal-dual  indicator  and  the  Tapia  indicators.  First  we  state  the  result  for  the 
variables  as  indicator. 

PROPOSITION  7.1.  Consider  a  sequence  of  iterates  {(xk,yk,  A*)}  generated  by 
Algorithm  1.  Assume  that 

1.  ( xk)Tyk  — >  0. 

2.  e ^  >  7  >  0  for  all  k  and  some  j. 

3.  The  algorithmic  parameters  <jk  and  rk  have  been  chosen  so  that 

<rk  — >  0  and  rk  — ►  1. 


Then  for  each  i  £  Z(x*) 


xk  — *  0,  Q-superlinearly. 
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If  in  addition  the  algorithmic  parameters  crk  and  rk  have  been  chosen  so  that 

(7.1)  <7*  =  0({xkTyk)t)  and  rk  =  l-0((xkTyk)t), 

for  some  £  £  (0, 1],  then  there  exists  x*  £  ri(T)  such  that  for  all  i  ^  Z(x*) 


with  R-rate  1  +  f. 

Proof.  For  each  i  £  Z(x*),  Assumptions  1  and  2  guarantee  the  convergence  of 
{x*}  to  zero,  see  Tapia,  Zhang,  and  Ye  [40].  By  Proposition  6.2  we  have 

x^+1 

— r - »  0,  for  i  £  Z(x*), 

xi 

which  proves  the  first  part. 

If  {<rk}  and  {rk}  are  chosen  according  to  (7.1),  then  Theorem  4.1  in  Tapia,  Zhang, 
and  Ye  [40]  implies  the  convergence  of  the  iteration  sequence  generated  by  Algorithm 
1  to  a  solution  x*  in  the  relative  interior  of  the  solution  set.  T  of  problem  (1.1).  The 
rest  of  the  proof  follows  from  Theorem  3.2  in  Zhang  and  Tapia  [50],  □ 

It  may  seem  surprising  that  an  indicator  possessing  Q-superlinear  convergence, 
componentwise,  does  not  demonstrate  good  behavior  in  practice.  The  discussion  in 
Section  4  on  the  effect  of  convergence  particulars  of  interior-point  methods  as  well 
as  the  effect  of  the  structure  of  the  solution  set  on  the  behavior  of  the  variables  as 
indicators  explains  this  phenomenon;  essentially  we  have  superlinear  convergence  to 
a  misleading  value. 

Now  we  prove  that  the  components  of  the  primal-dual  indicator  that  converge  to 
zero  do  so  Q-superlinearly.  In  this  case  the  poor  practical  behavior  of  the  primal-dual 
indicator  is  due  to  its  lack  of  scale  independence  and  to  the  structure  of  the  solution 
set  as  discussed  in  Section  5. 

PROPOSITION  7.2.  Consider  a  sequence  of  iterates  {(xk,yk,  A*)}  generated  by 
Algorithm  1.  Assume  that 

1.  (**)V  —  o. 

2.  ^  >  7  >  0  for  all  k  and  some  7. 

3.  The  algorithmic  paramaters  ok  and  rk  have  been  chosen  so  that 

ak  — *  0  and  r*  -*  1. 


Then  for  i  £  Z(x*), 


where  x*+1  =  xk  4-  f3k  Ax  and  yk+1  =  yk  +  0k  Ay  for  any  0k  £  [ak ,  1]  with  ak  given 
m  Step  3  of  Algorithm  1. 

Proof.  For  i  £  Z{x*)  we  have 


PD- 

PD--1 


*£!_ y± 

yki+1  4 


i  £  Z(x*). 


The  result  follows  directly  from  Proposition  6.2. 


□ 
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It  is  worth  mentioning  here  that  the  Qi  factor  of  the  primal-dual  indicator  is  the 
quotient  of  the  Tapia  primal  indicator  to  the  Tapia  dual  indicator. 

Now  we  consider  the  rate  of  convergence  of  the  Tapia  indicators. 

PROPOSITION  7.3.  Consider  a  sequence  of  iterates  {(a:*,  yk ,  A*)}  generated  by 
Algorithm  1.  Assume  that 

1.  (xk)Tyk  —  0. 

2.  !^0ryr~k  >  7  >  0  for  all  k  and  some  y. 

3.  The  algorithmic  paramaters  ak  and  rk  have  been  chosen  so  that 


ak  =  0((xkTyk)t)  and  Tk  =  1  —  0{{xkT yk)^)  for  some  £6(0,1]. 


Then 


Ti(xk) 


f  0  if  i  6  Z(x*) 
\  1  ifi$Z(x *) 


with  an  R-rate  of  convergence  of  l+£,  where  xk+l  =  xk  +/3kAx  and  t/*"*"1  =  yk  +/3k  Ay 
for  any  (3k  6  [«*,  1]  with  ak  given  in  Step  3  of  Algorithm  1. 

Proof.  Assumptions  1-3  guarantee,  by  Theorem  4.1  in  Tapia,  Zhang  and  Ye  [40], 
the  convergence  of  the  iterate  sequence  generated  by  Algorithm  1  to  a  solution  x*  in 
the  relative  interior  of  the  solution  set  of  problem  (1.1).  By  Theorem  3.2  in  Tapia, 
Zhang  and  Ye  [40],  we  obtain 


\Ti(xk)  -  1 


\0kAxk\ 

xk 


'■ 


(/?V)V  +  /?'v*), 


for  some  positive  constants  /?'  and  f3" .  On  the  other  hand, 


|Tj(**)|  <  \Ti(yk)-l\  +  ak 


( xk)Tyk /n 
*iVi 


i  6  Z(**). 


The  remainder  of  the  proof  follows  from  Assumptions  (2)  and  (3)  and  the  above  two 
inequalities.  □ 

It  is  not  clear  that  a  corresponding  Q-rate  result  can  be  proved  for  the  Tapia  indi¬ 
cators.  Although  the  above  result  is  satisfying,  it  may  seem  to  fall  short  of  explaining 
the  good  behavior  of  the  Tapia  indicators.  In  the  next  proposition  we  try,  at  least 
partially,  to  explain  this  behavior. 

Proposition  7.4.  Consider  a  sequence  of  iterates  {{xk ,  yk ,  A*)}  generated  by 
Algorithm  1.  Assume  that 

1.  ( xk)Tyk  — ►  0. 

2-  e ^  >  7  >  0  for  all  k  and  some  y . 

3.  The  algorithmic  parameters  ak  and  rk  have  been  chosen  so  that 

ak  —  0((xkT yk)()  and  rk  —  1  —  0((xkT yk)^)  for  some  £  6  (0, 1]. 

Then  for  i  =  1 , ,n 


Xj*+i  y{k+1 

- r 1 - r - 1  — >  0,  Q-superlinearly, 

xC  yp 

where  xk+1  =  xk  +  (3k  Ax  and  yk+1  —  yk  +  (3k  Ay  for  any  (3k  6  [«*,  1]  with  ak  given 
in  Step  3  of  Algorithm  1. 
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Proof.  Define 


Then 


hence 


....  fc+i 

*.  !/. 


1/*  +  !  |  _  (T^1  a^  +  lT^  +  lT/g.^  +  ly,.*  +  l 


l/f 


x*  yk  /xikyik 


i/?*1 1  _ 

Wl  ^VT/n*K*  ’ 

where  C  is  some  positive  constant.  From  Assumption  2  we  have 


kT  , 
x*  y 


<  - 


Xikyik  7 


Also  we  have 


M  £ 

x  y 

XikXJik  ~ 


>  1. 


Together  the  last  three  inequalities  imply  that 


and  completes  the  proof.  0 

It  is  well-known  that  if  a  sequence  of  reals  converges  at  least  Q-linearly,  then 
eventually  the  sequence  of  errors  is  strictly  monotone  decreasing.  This  is  not  neces¬ 
sarily  the  case  for  R-linear  convergence.  Hence  Propositions  7.3  and  7.4  imply  that 
while  the  error  sequence  of  the  Tapia  primal  indicator  or  the  Tapia  dual  indicator  is 
not  necessarily  monotone  decreasing,  the  sum  of  the  two  error  sequences  will  eventu¬ 
ally  be  strictly  monotone  decreasing.  This  fact  speaks  for  using  the  two  indicators  in 
tandem. 


8.  Numerical  Experience  .  In  this  section  we  present  several  numerical  exper¬ 
iments  with  three  indicators:  the  variables  as  indicators,  the  primal-dual  indicator, 
and  the  Tapia  indicator.  The  purpose  of  these  experiments  is  to  demonstrate  the 
undesirability  of  variables  as  indicator,  to  study  the  behavior  of  the  primal-dual  in¬ 
dicator  in  primal-dual  interior-point  methods,  and  finally  to  compare  between  the 
ability  of  the  three  indicators  to  identify  zero  variables  in  linear  programming.  These 
experiments  are  performed  on  a  subset  of  the  netlib  test  set  using  a  predictor-corrector 
primal-dual  interior-point  code  that  was  developed  at  Rice  University.  The  code  gen¬ 
erates  a  sequence  of  iterates  that  approach  feasibility  and  drive  the  absolute  duality 
gap  cT x  —  bT y  to  zero.  For  numerical  purposes  our  stopping  criterion  is  stated  in 
terms  of  the  relative  gap  c1^ryf ,  rather  than  the  absolute  gap.  We  will  say  that  a 

problem  is  solved  to  an  accuracy  of  10-<i  for  some  positive  integer  d  if  the  algorithm 
is  terminated  when 

f  \cTxk  —  bT yk  |  ||Ac*  -6||!  \\AT \k  +  yk  -  c||x  \ 

V  i  +  |fc7VI  ’  i  +  ||z*||i  ’  i  + 1| a* ill  +  ii^Hi/ 


max 


<  1(T'2. 
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In  the  following  experiments  all  problems  are  solved  to  an  accuracy  of  10-8  unless 
otherwise  specified.  This  choice  agrees  with  the  default  choice  for  the  stopping  crite¬ 
rion  in  many  interior-point  codes.  The  experiments  were  performed  on  a  Sun  4/490 
workstation  with  64  Megabytes  of  memory. 

8.1.  The  Undesirability  of  Variables  as  Indicator.  Historically,  the  vari¬ 
ables  were  probably  the  first  indicators  used  to  predict  Z(x*).  In  Section  4  we  dis¬ 
cussed,  in  detail,  the  disadvantages  of  this  indicator.  In  our  numerical  experiments, 
we  observed  that  in  many  test  problems  from  the  netlib  collection  it  was  extremely 
difficult,  and  sometimes  even  impossible,  to  distinguish  between  zero  and  nonzero 
variables,  in  an  approximate  solution  generated  by  primal-dual  interior-point  meth¬ 
ods,  using  only  the  values  of  these  variables.  In  the  following  we  will  investigate  some 
ideas  that  have  been  proposed  for  using  the  variables  to  determine  Z(x*). 

(i)  Set  Xi  =  0  if  xf  <  S2ero. 

Gill  et  al.  [11]  chose  6zero  —  10-8.  In  many  cases  the  algorithm  terminated  with 
most  of  the  zero  variables  having  values  greater  than  10-5  ,  e.g.  SHARE1B  and 
SCAGR25  (in  fact  some  problems  had  zero  variables  of  order  10~2  when  the  algorithm 
terminated,  e.g.  SCAGR25).  So,  this  choice  is  very  conservative.  If  we  use  6zero  — 
10-6  some  of  the  nonzero  primal  variables  in  PILOT4  and  some  of  the  nonzero  dual 
elements  in  CYCLE  have  values  less  than  this  threshold.  So,  a  good  choice  of  S2ero 
is  extremely  difficult  to  find.  We  also  observed  that  for  a  particular  problem  zero 
elements  may  have  a  wide  range  of  magnitude  at  the  approximate  solution  generated 
by  the  primal-dual  interior-point  method.  For  example  in  GREENBEA  the  zero 
variables  in  the  approximate  solution  have  magnitudes  ranging  from  10-1  to  10~5. 

(ii)  Set  Xi  =  0  if  xf  <  6X  and  yf  >  6y . 

From  our  experience  with  the  neilib  problems  we  observed  that  the  final  approximate 
solution  generated  by  a  primal-dual  interior-point,  method  may  not  have  enough  sepa¬ 
ration  between  the  primal  variables  and  the  dual  slacks.  For  example,  the  pair  y,), 
for  some  values  of  t,  is  of  order  (10— 6 , 10-4)  in  LOTFI  and  BANDM,  (10~2, 10-1) 
in  SCAGR25  and  (1(T4, 10~4)  in  both  SCAGR25  and  FFFFF800.  This  shows  that 
choosing  effective  thresholds  Sx  and  6y  for  a  given  set  of  problems  is  practically  im¬ 
possible.  It  is  also  worth  mentioning  that  in  SEBA  the  pair  (x,-,  j /,•),  for  some  values  of 
i  is  of  the  order  ( 10— 3 ,  10-1)  while  for  a  different  value  of  i  it  is  of  order  ( 10— 4 , 10-2). 
This  shows  that  choosing  these  thresholds  is  practically  impossible  even  for  variables 
in  the  same  problem.  It  is  interesting  to  observe  that  in  problem  NESM  the  algorithm 
terminated  with  a  certain  pair  equal  to  (1.543,0.000015);  which  gives  the  impression 
that  the  dual  slack  is  zero  and  the  primal  variable  is  nonzero  at  the  solution.  Solving 
the  problem  to  an  accuracy  of  10-15  reduced  the  value  of  Xi  to  0.21  x  10-9  while  the 
value  of  the  dual  slack  remained  the  same,  see  Table  1.  In  Table  1,  the  letter  N  in  the 
last  row  means  that  we  could  not  solve  the  problem  to  an  accuracy  of  10-16.  This 
example  shows  how  misleading  the  variables  can  be. 

Some  authors  propose  choosing  bzero  adaptively.  Although  this  idea  may  slightly 
improve  the  results  obtained  by  using  the  variables  as  indicators,  we  believe  it  will 
not  account  for  that  much  improvement.  The  reason  is  that,  as  mentioned  in  Section 
4,  at  an  approximate  solution  generated  by  a  primal-dual  interior-point,  method  the 
variables  x,-  with  i  £  Z(x*)  have  small,  but  not  zero,  values.  In  fact,  we  observed  that 
the  algorithm  may  terminate  with  some  of  these  values  relatively  large.  In  several 
cases,  some  of  these  values  were  larger  than  values  of  the  positive  variables,  e.g. 
GREENBEA  and  NESM.  This  phenomenon  implies  that,  at  least  for  these  problems, 
the  choice  for  6zero  is  practically  impossible.  One  may  then  suggest  that  we  solve 
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Table  8.1 

NESM:  A  -particular  solution  pair 


relative 

gap 

X 

V 

TAPIA 
For  x 

PRIMAL 

DUAL 

28 

10-6 

21.793 

1.58D-5 

0.878 

0.12D+7 

29 

10-6 

17.905 

1.57D-5 

0.784 

0.10D+7 

30 

10-7 

7.946 

1.58D-5 

0.443 

0.50D+6 

31 

10-8 

1.543 

1.59D-5 

0.26D-2 

0.13D+4 

32 

10-8 

0.2648 

1.59D-5 

0.14D-3 

0.14D+2 

33 

10~9 

0.35D-1 

1.59D-5 

0.57D-5 

0.96D-1 

34 

10-11 

0.13D-2 

1.59D-5 

0.25D-6 

0.45D-3 

35 

10-12 

0.44D-4 

1.59D-5 

0.19D-7 

0.16D-5 

36 

10-13 

0.18D-5 

1.59D-5 

0.26D-8 

0.47D-8 

37 

10“15 

0.21D-9 

1.59D-5 

0.19D-9 

0.16D-12 

38 

10” 18 

N 

N 

N 

N 

the  problem  to  a  greater  accuracy  so  that  there  is  a  clear  distinction  between  zero 
and  nonzero  variables.  Although  this  idea  is  conceptually  correct,  it  overlooks  three 
important  issues. 

1.  It  is  not,  generally,  known  a  priori  to  what  accuracy  a  particular  problem 
should  be  solved.  For  example,  we  solved  problem  D2Q06C  to  an  accuracy  of  10-11 
and  we  still  had  some  pairs  («i,  j/i)  of  order  (10-5, 10-4)  and  even  (10~4, 10-4).  An¬ 
other  example  is  problem  CYCLE  which  we  solved  to  an  accuracy  of  10-12  and  we 
had  pairs  of  order  (1(T9, 1(T5),  (1(T8,  KT6),  (IQ-8, 10"7)  and  (1(T7, 1(T7). 

2.  Since  the  linear  systems  we  are  solving  are  necessarily  singular  at  any  so¬ 
lution  for  degenerate  problems,  in  some  problems  these  systems  may  become  very 
ill-conditioned  near  a  solution  so  that  we  cannot  solve  the  problem  to  the  desired 
accuracy.  An  example  of  this  is  problem  NESM.  We  solved  this  problem  to  an  accu¬ 
racy  of  10~15.  We  observed  that  some  pairs  (a?, ,  j/i)  were  of  the  order  (10-7, 10-7). 
Unfortunately,  we  cannot  ask  for  more  accuracy  with  the  given  precision. 

3.  Finally,  even  if  we  know  the  recpiired  accuracy  and  we  are  able  to  solve  the 
problem  to  that  accuracy,  we  miss  one  of  the  main  objectives  of  the  indicators.  This 
objective  is  to  predict  zero  variables  as  early  as  possible  in  order  to  save  computational 
work. 

8.2.  The  Behavior  of  Several  Indicators.  This  experiment  compares  the 
ability  of  three  indicators,  the  variable  as  indicator,  the  Tapia  indicator  and  the 
primal-dual  indicator,  to  identify  zero  variables.  Naturally,  the  number  of  zero  vari¬ 
ables  predicted  by  each  indicator  will  depend  on  the  indicator’s  threshold  <5/  in  the 
identification  test 


Ii(xk)  <  5/  =>  x*  =  0. 

For  this  experiment  we  choose  6variabies  —  10-6,  &Tapia  —  0.1  and  6pr imai-duai  =  0.1. 
The  choice  of  this  value  of  primal- dual  is  based  upon  our  own  experience  that  if 
<5 primal-dual  >  1,  the  primal-dual  indicator  predicts  the  wrong  set  of  zero  variables 
more  often.  We  choose  6variabies  —  10-6  because  in  some  of  the  test  problems  the 
terminal  values  of  zero  variables  are  greater  than  10~e  at  the  approximate  solution 
when  the  problems  are  solved  to  an  accuracy  of  10-8;  which  is  the  default  choice  in 
our  algorithm  as  well  as  most  primal-dual  interior-point,  methods.  In  Table  2,  the  first 
column  gives  the  problem  name.  Corresponding  to  each  problem  the  second  column 
gives  the  name  of  the  indicator  used  to  predict  members  of  Z(x*).  Columns  3  to  9 
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give  the  quotient  i.e.  the  percentage  of  the  zero  variables  correctly  identified 

at  the  corresponding  iteration.  These  columns  correspond  to  the  last  7  iterations 
before  the  algorithm  terminated.  Here  M  is  the  total  number  of  iterations  required 
to  solve  the  problem  to  an  accuracy  of  10-8.  We  stress  that  we  count  the  predicted 
zero  variables  only  when  the  set  Zk  C  Z(x*),  i.e.  Zk  does  not  have  any  member  i 
with  the  corresponding  x*  positive  at  the  solution. 

For  the  set  of  problems  given  in  Table  2,  the  Tapia  indicator  shows  a  better 
ability  to  predict  zero  variables.  It  is  also  clear  that  the  ability  of  the  variables  to 
determine  Z(x*)  early  is  minimal.  An  obvious  example  is  problem  GROW7,  where 
the  variables,  used  as  indicators,  were  able  to  determine  only  one  element  of  Z(x*) 
when  the  algorithm  terminated.  All  the  problems  in  Table  2  are  solved  to  an  accuracy 
of  10~8.  It  is  worth  mentioning  here  that  at  iteration  7  in  problem  SCSD1  the  set 


Table  8.2 

Comparison  between  indicators 


PROBLEM 

INDICATOR 

M— 6 

M— 5 

M— 4 

M— 3 

M-2 

M— 1 

M 

AFIRO 

(M=8) 

VARIABLES 

0 

0 

0 

0 

0 

0 

100 

PRIMAL-DUAL 

0 

0 

0 

0 

0 

100 

100 

TAPIA 

0 

0 

3 

14 

93 

100 

100 

ADLITTLE 

(M=12) 

VARIABLES 

0 

0 

0 

0 

7 

78 

100 

PRIMAL-DUAL 

0 

0 

0 

0 

0 

0 

100 

TAPIA 

7 

9 

52 

79 

94 

100 

100 

SCSD1 

(M=9) 

VARIABLES 

0 

0 

0 

0 

0 

100 

100 

PRIMAL-DUAL 

0 

0 

0 

0 

0 

100 

100 

TAPIA 

0 

0 

25 

66 

99 

100 

100 

SHIP04L 

(M=17) 

VARIABLES 

0 

0 

11 

11 

11 

98 

100 

PRIMAL-DUAL 

0 

0 

0 

0 

0 

0 

100 

TAPIA 

0 

0 

97 

10 

100 

100 

100 

SHARE2B 

(M=12) 

VARIABLES 

0 

0 

0 

0 

0 

73 

100 

PRIMAL-DUAL 

0 

0 

69 

76 

81 

100 

100 

TAPIA 

3 

30 

63 

79 

97 

100 

100 

GROW  7 
(M=14) 

VARIABLES 

0 

0 

0 

0 

0 

0 

2 

PRIMAL-DUAL 

0 

0 

0 

0 

0 

98 

100 

TAPIA 

0 

0 

0 

0 

11 

96 

100 

Z 7  that  is  determined  by  the  primal-dual  indicator  contains  all  correct  zeros  as  well 
as  one  nonzero  variable.  This  shows  that  one  should  be  cautious  when  using  the 
primal-dual  indicator.  This  behavior  is  observed  in  other  problems  as  well. 

An  example  of  the  undesirable  behavior  of  the  primal-dual  indicator  for  some 
variables  in  problem  WOOD1P  is  shown  in  Figures  1.  In  Figure  1,  the  variable  (rep¬ 
resented  by  the  solid  line)  is  very  small  but  not  zero  at  the  solution  (the  optimal 
value  of  that  variable  is  1.1  x  10-5).  The  corresponding  primal-dual  indicator  (rep¬ 
resented  by  the  dotted  line)  has  very  small  values  (less  than  10~4)  for  12  iterations 
(the  algorithm  terminated  at  iteration  13)  giving  the  impression  that  this  variable  is 
zero  at  the  solution.  The  Tapia  indicator  for  the  same  variable  is  presented  by  the 
dashed  line.  The  indicator  accurately  predicts  very  early  that  the  terminal  value  of 
this  variable  is  not  zero.  Note  that  this  problem  is  very  well  behaved  in  the  sense  that 
the  variables  start  approaching  their  optimal  values  reasonably  early.  Figure  2  shows 
the  behavior  of  the  Tapia  indicator  for  all  the  variables  in  problem  SCSD1.  Although 
a  few  of  the  indicators  converge  to  their  terminal  values  late,  most  indicators  do  give 
the  correct  information  early.  Finally,  we  do  not  mean  to  imply  that  the  primal-dual 
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FlG.  8.1.  The  Tapia  and  primal-dual  indicators  for  a  small  nonzero  variable. 


indicator  always  follows  the  pattern  seen  in  Figures  1.  In  fact,  Figure  3  shows  an  ex¬ 
ample  in  which  the  primal-dual  indicator  performs  very  well,  again  here  the  solid  line 
represents  the  variable  and  the  dotted  line  represents  the  corresponding  primal-dual 
indicator.  However,  from  our  numerical  experience,  we  believe  that  much  care  should 
be  taken  when  the  primal-dual  indicator  is  used. 

Finally,  Table  1  gives  an  example  in  which  both  the  variables  and  the  primal-dual 
indicator  fail  to  give  the  correct  information  when  the  algorithm  terminated.  The 
first  column  of  that  table  gives  the  iteration  count.  The  second  column  gives  the  cor¬ 
responding  relative  gap.  The  last  four  columns  give  the  values  of  the  primal  variable, 
the  corresponding  dual  slack,  the  Tapia  indicator  and  the  primal-dual  indicator,  re¬ 
spectively,  for  a  given  variable  in  problem  NESM.  This  variable  is  zero  at  the  solution. 
We  note  that  when  the  algorithm  stops  at  iteration  31,  the  primal-dual  indicator  has 
a  very  large  value.  So,  it  fails  to  give  the  correct  information  at  this  iteration.  Note 
that  the  Tapia  indicator,  at  the  same  iteration,  correctly  indicates  that  this  variable  is 
zero  at  the  solution.  We  had  to  take  two  more  iterations  in  order  for  the  primal-dual 
indicator  to  give  the  correct  information. 

8.3.  A  Generic  Procedure.  In  the  following  we  introduce  a  generic  procedure 
to  identify  zero  variables  in  linear  programming  problems.  This  procedure  is  a  sample 
procedure  of  what,  in  our  opinion,  should  be  a  framework  for  zero- variable  identifica¬ 
tion  technicpies.  It  is  our  considered  opinion  that  any  effective  procedure  of  this  kind 
should  have  three  features 

(i)  An  effective  indicator. 

(ii)  A  good  way  of  handling  the  information  from  the  indicator. 
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Fig.  8.2.  The  Tafia,  indicators  for  all  variables  in  problem  SCSDl. 


(iii)  A  mechanism  to  detect  and  recover  if  an  error  is  made  in  predicting  members 
of  Z(x*). 

Procedure  8.1. 

At  iteration  k, 

(i)  Test  for  errors  in  Z(xk).  If  they  exist,  recover. 

(ii)  Test  indicators  —  Z(xk))  using  the  identification  criterion 


(8.1)  Ii  <  6. 

(iii)  Use  information  from  (ii)  to  update  the  estimate  of  Z(xk). 

We  emphasize  again  that  far  away  from  the  solution  set  of  the  linear  programming 
problem  all  indicators  give  only  heuristics  for  the  identification  of  zero  Z*  .  This  fact 
emphasizes  the  importance  of  a  recovery  procedure  if  an  error  is  made.  It  also  supports 
the  strategy  that  we  adopted  in  our  application  of  Procedure  8.1  on  the  netlib  set  of 
test  problems.  This  strategy  has  the  following  two  features 

(i)  We  used  two  indicators  to  confirm  information  obtained  from  each  of  them 
individually.  From  our  numerical  experiments  it  seems  beneficial  to  use  more 
than  one  indicator  (in  particular  if  they  are  not  expensive  to  compute). 

(ii)  We  adopted  a  conservative  strategy  in  using  the  information  obtained  in 
Step  (ii)  in  Procedure  8.1. 

Now  we  discuss  the  particulars  of  our  implementation  of  Procedure  8.1.  For  step  (ii) 
we  use  two  indicators,  the  sum  of  the  Tapia  indicators 
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Fig.  8.3.  The  primal-dual  indicator  for  a  positive  variable. 

and  the  primal-dual  indicator 

PDki  = 

where  x*+1  =  x*  +  Ax*  and  y*+1  =  y*  +  Ay*.  For  the  the  indicator  77*  we  used 
6ti  =  0.2  and  for  the  primal-dual  indicator  we  used  bprimai-duai  =0.1. 

For  step  (iii),  we  adopted  the  strategy  proposed  by  Tapia  [38].  This  strategy  was 
implemented  by  Vardi  [46]  in  the  context  of  nonlinear  programming  problems.  The 
idea  is  to  divide  the  set  {1,  .  .  . ,  n}  at  iteration  k  into  three  categories,  Z(xk)  consisting 
of  indices  corresponding  to  variables  that  are  predicted  to  be  zero,  Af Z(xk)  consisting 
of  indices  corresponding  to  variables  that  are  predicted  to  be  nonzero,  and  a  third 
category  U(xk)  —  {1, . . . ,  n)  —  ( Z(xk )  U  Af Z(xk))  consisting  of  indices  corresponding 
to  variables  that  we  feel  we  do  not  have  enough  information  about  to  move  them 
to  one  of  the  first  two  categories.  There  are  several  ways  to  specify  each  of  the 
three  categories  and  the  rule  to  move  a  variable  from  one  category  to  another.  In 
our  implementation  we  start  with  Af Z(x°)  =  Z(x°)  =  0  and  U(xl>)  —  {1, . .  ,,n}.  An 
index  i  is  moved  from  U{xk)  to  Z{xk)  if  the  identification  criterion  (8.1)  is  satisfied  for 
the  two  indicators  7)  and  PD,-  for  two  consequent  iterations.  If  (8.1)  is  not  satisfied 
for  at  least  one  of  the  two  indicators  for  more  than  one  iteration  then  i  is  moved 
from  U{xk)  to  Af Z(xk).  An  index  i  £  Af Z(xk)  moves  to  U(xk)  if  the  two  indicators 
satisfy  (8.1)  at  the  same  iteration.  This  procedure  was  tested  on  a  subset  of  the 
NETLIB  set  of  LP  test  problems.  The  results  were  quite  satisfying.  With  no  recovery 
technique  the  procedure  failed  in  12  problems  out  of  the  80  problems  tested.  When  a 


nk  + 1 


Vi 


t+1 
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simple  recovery  technique  was  used  the  procedure  failed  only  on  one  problem,  NESM, 
where  the  procedure  predicted  only  some  members  of  2(x*).  The  performance  of  our 
procedure,  for  some  problems  from  the  netlib,  is  given  in  Table  3.  The  first,  column 
of  this  table  gives  the  names  of  the  problem  solved  while  columns  2  to  10  give  the 
number  of  positive  variables  at  the  corresponding  iteration  after  the  zero  variables 
have  been  identified  and  removed.  Here  M  is  the  total  number  of  iterations  required 
to  solve  the  problem  to  an  accuracy  of  10“ 8 .  We  stress  that  for  the  results  in  Table  3, 
the  procedure  was  activated  when  the  relative  gap  was  smaller  than  10“'.  However, 
some  experiments  were  conducted  using  the  procedure  from  the  first  iteration  and 
the  results  were  also  promising,  although  in  this  case  we  had  to  be  more  conservative 
in  choosing  the  thresholds  bjapia  and  <5prtma;-dua(  •  The  total  numbers  of  iterations 

Table  8.3 

Identifying  zero  variables  for  some  test  problems 


PROBLEM 

00 

1 

£ 

M— 7 

M— 6 

M-5 

M— 4 

M— 3 

M— 2 

M— 1 

M  | 

FINNIS 

985 

545 

503 

482 

469 

407 

394 

390 

PILOT4 

1181 

899 

840 

840 

756 

667 

667 

611 

SCRS8 

1275 

1204 

1141 

1096 

553 

379 

329 

325 

DEGEN2 

595 

595 

595 

555 

543 

436 

70 

53 

53 

GFRD-PNC 

1149 

1149 

1149 

1011 

872 

842 

426 

407 

407 

CYCLE 

2139 

2139 

2139 

1865 

1655 

1495 

1360 

1265 

1260 

MAROS 

1906 

1906 

1906 

1832 

1806 

634 

493 

480 

476 

WOOD1P 

2395 

2288 

2181 

2151 

1767 

1418 

52 

39 

39 

SHIP12L 

5533 

5533 

5528 

5528 

5306 

5306 

3590 

726 

726 

GREENBEA 

5283 

5283 

5051 

1790 

1639 

1543 

1429 

1384 

1375 

required  by  the  ten  problems  listed  are  as  follows:  for  FINNIS  M=27,  PILOT4  M=35, 
SCRS8  M=26,  DEGEN2  M=14,  GFRD-PCN  M=18,  CYCLE  M=  26,  MAROS  M=24, 
WOOD1P  M=14,  SHIP12L  M=  17  and  GREENBEA  M=44. 

The  following  remarks  are  of  interest 

(i)  If  the  procedure  is  used  very  early  in  the  iterative  process,  the  algorithm  may 
converge  to  a  point  on  the  relative  boundary  of  the  solution  set,  i.e.  a  solution  with 
more  zero  variables  than  the  ones  in  the  relative  interior  of  that  set.  An  example  of 
this  phenomenon  is  problem  SCSD6.  When  the  identification  procedure  was  activated 
with  the  relative  gap  less  than  10“ 1 ,  the  final  approximate  solution  had  1168  zero 
variables.  When  we  used  the  identification  procedure  from  the  first,  iteration  t.he  final 
approximate  solution  had  1198  zeros. 

(ii)  We  noticed  that  identifying  and  removing  zero  variables  early  may  actually 
reduce  the  total  number  of  iterations  required.  Some  examples  are  given  in  Table  4. 

Table  8.4 

Saving  iterations  by  removing  zero  variables 


PROBLEM 

Total  number  of  iterations 
without  removing  zero  variables 

Total  number  of  iterations 
if  zero  variables  are  removed 

CYCLE 

26 

23 

SCSD6 

12 

10 

GREENBEA 

45 

44 

WOODlP 

15 

14 

9.  Concluding  Remarks.  This  study  focused  on  three  philosophically  distinct, 
indicators:  the  variables  as  indicator,  the  primal-dual  indicator,  and  the  Tapia  indi¬ 
cators.  Theorem  2.2,  Proposition  5.1  and  Proposition  6.2  describe  the  characteristic 
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behavior  of  these  three  indicators.  An  immediate  implication  of  Theorem  2.2  is  that 
the  undesirable  behavior  that  results  from  using  the  variables  as  indicator  cannot 
be  avoided.  Proposition  5.1  pinpoints  the  source  of  the  undesirable  behavior  that 
the  primal-dual  indicator  exhibits  in  interior-point  methods.  Proposition  6.2  states 
that  the  Tapia  indicator  retains  its  useful  properties,  when  used  with  primal-dual 
interior-point  methods,  under  essentially  the  same  conditions  that  guarantee  fast  lo¬ 
cal  convergence  of  the  duality  gap  sequence.  We  emphasize  that  convergence  of  the 
iteration  sequence  is  not  required  for  this  result.  Our  numerical  results  are  interesting 
and  present  a  solid  case  against  the  use  of  the  variables  as  indicator,  and  motivate  the 
use  of  the  Tapia  indicators  over  the  so-called  primal-dual  indicator.  In  conclusion,  we 
strongly  believe  that  the  use  of  indicators  can  be  a  very  useful  and  powerful  tool  and 
deserves  further  study. 
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A.  Appendix:  Existing  Indicators.  In  this  appendix  we  catalogue  various 
indicators  that  appear  in  the  literature. 

A.l.  The  Kojima  Indicator.  Kojima  [20]  proposed  a  method  to  be  used  in 
Karmarkar-type  methods  for  identifying  positive  variables.  The  Kojima  indicator  is 
defined  as  the  piecewise  linear  function: 

(A.l)  h(xk,p)  =  min{c?  +  C—^—  +  (Piij  +  -)p  :  j  ±  *},  »  =  l,...,n, 

n  n 

where  P  is  a  specific  matrix  of  the  form 

P  =  I  —  (AD)T  M  —  (l/n)eeT , 

where  e  =  (l,...,l)T,Misa  specific  mxn  matrix,  xk  is  a  strictly  feasible  point,  cp  — 
PDc  where  D  =  diag(xk),  and  p  is  a  parameter.  Kojima  proved  that  U(xk ,  p)  >  0  is 
a  sufficient  condition  for  x*  to  be  positive  for  any  solution  with  nonpositive  objective 
function.  On  the  other  hand,  it  is  not  clear  that  the  Kojima  indicator  satisfies  the 
sharp  or  the  uniform  separation  properties.  It  also  depends  on  an  auxiliary  parameter 
p  and  it  is  not  obvious  how  it  should  be  chosen.  It  was  noted  by  Kojima  that  this 
indicator  requires  primal  nondegeneracy.  The  test  proposed  by  Kojima  to  identify 
zero  variables  costs  0(mn)  arithmetic  operations  for  each  variable. 

A. 2.  The  Ye- Todd  Indicators.  Todd  [41]  proved  that  all  dual  optimal  solu¬ 
tions  are  contained  in  ellipsoids  that  can  be  generated  as  a  by-product  of  the  Kar- 
markar  algorithm.  Using  this  information,  he  proposed  an  indicator  that  can  identify 
a  subset  of  primal  variables  that  are  zero  at  every  primal  optimal  solution.  Ye  [47] 
rigorously  studied  this  idea  and  proposed  a  closely  related  indicator  in  a  closed  form. 
The  Ye- Todd  indicator  is 

(A. 2)  I(yk,ek)  =  yk  -ck  (Dk)~1qk 

where  Dk  a  positive  diagonal  matrix  with  <lk  as  its  diagonal  (usually  dk  —  xk),  qk 
is  the  diagonal  of  the  projection  matrix  Dk  AT  (A(Dk)2AT)~1  ADk  and  ek  >  0  is  a 
parameter  of  the  dual-slack  ellipsoid 

{y-\\Dk(y-yk)\\<ek} 
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that  contains  all  the  optimal  dual  slacks  y* .  The  parameter  ek  is  set  equal  to  the 
duality  gap  at  each  iteration.  Ye  proved  that  I(yk  ,tk)  >  0  implies  that  x*  —  0  in 
all  optimal  solutions  of  that  problem.  The  Ye- Todd  indicator  has  interesting  theo¬ 
retical  properties.  Unfortunately,  it  is  not  clear  that  it  satisfies  the  uniform  or  sharp 
separation  properties.  Also  it  is  expensive  to  compute  and  finally,  it  requires  primal 
nondegeneracy.  Anstreicher  [1]  proposed  a  modification  to  the  Ye- Todd  approach  to 
extend  it  to  problems  with  primal  degeneracy. 

Using  a  similar  approach,  Ye  and  Todd  [43]  described  a  path-following  algorithm 
for  convex  quadratic  programming  problems  which  uses  a  sequence  of  ellipsoids.  Each 
of  these  ellipsoids  contains  all  of  the  primal  and  dual-slack  solutions.  They  propose 
an  indicator,  using  these  ellipsoids,  to  identify  zero  variables  in  the  course  of  an 
interior-point  algorithm  for  linear  programming.  Although  their  indicator  has  very 
nice  theoretical  properties,  it  is  expensive  to  compute  at  each  iteration.  It  also  requires 
nondegeneracy  if  all  zero  variables  are  to  be  identified. 

A. 3.  Tlie  Kovacevic-Vujcic  Indicator.  In  an  attempt  to  accelerate  the  con¬ 
vergence  of  Karmarkar-type  methods,  Kovacevic-Vujcic  [22]  introduced  the  following 
indicator 

(A. 3)  I{xk,Axk 

where 


Kovacevic-Vujcic  proved  that  this  indicator  is  superlinearly  faster  than  the  vari¬ 
ables,  namely 

i,m  =  o, 

k  — ►oo  Hx*  —  :c*|| 


)  =  xk  +ak{xk+l  -xk), 


—  X, 


mm  — 

k- 4-i  t 


fc  +  1  ' 


where 


I*  =  lim  I(xk ,  Ax*). 

k—koo 

On  the  other  hand,  this  indicator  satisfies  neither  the  sharp  nor  the  uniform  separa¬ 
tion  property.  It  also  requires  dual  nondegeneracy.  Finally  we  note  that  while  the 
Kovacevic-Vujcic  indicator  may  be  of  use  in  Karmarkar-type  methods;  it  is  not  clear 
that  it  is  of  use  in  the  context  of  the  recent  primal-dual  interior-point,  methods.  The 
reason  is  that  in  the  more  effective  implementation  of  these  methods  the  step  is  taken 
to  the  boundary  asymptotically,  see  Zhang,  Tapia  and  Dennis  [52],  This  means  that 
the  indicator,  asymptotically,  will  coincide  with  the  iterate. 

A. 4.  The  Mehrotra  Indicator.  Mehrotra  [30]  consider  the  quantity 

(A. 4)  Ii(yk ,  Ayk ,ak)  =  ^  ~  ^  , 

Vi 

which  measures  the  relative  change  in  the  dual  slack  yk ,  as  an  indicator.  Mehro¬ 
tra  used  this  indicator  to  drop  constraints  when  implementing  a  dual  affine  scaling 
method. 
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A. 5.  The  Tapia- Zhang  Indicator.  In  an  attempt  to  uncover  an  optimal  basis 
of  the  linear  program  (1.1),  Tapia  and  Zhang  [39]  introduced  the  indicator: 

(A. 5)  I(dk)  =  diag[DkAT(A(Dk)2AT)-1ADk } 

where  Dk  is  the  diagonal  matrix  with  dk  as  its  diagonal.  The  vector  dk  can  be  xk ,  yk 
or  ( Yk)~lXke .  Assuming  primal  and  dual  nondegeneracy  Tapia  and  Zhang  proved 
that  this  indicator  satisfies  both  the  sharp  and  the  uniform  separation  properties, 
namely 

<A'6)  £m  /,(^)={  J 

They  also  proved  that  it  is  quadratically  faster  than  the  variables,  i.e. 

\\i(zk)-r\\<o(\\xk-x*\\2), 

where 

/*  =  lim  I(zk). 

fc  — *oo 

This  indicator  can  be  used  for  primal,  dual,  and  primal-dual  methods.  For  some  in¬ 
teresting  theoretical  properties  of  this  indicator  see  [39],  The  disadvantages  of  this 
indicator  are  that  it  requires  primal  and  dual  nondegeneracy  and  is  expensive  if  eval¬ 
uated  at  each  iteration. 

A. 6.  The  Resende-Veiga  Indicator.  Resende  and  Veiga  [36]  used  the  recip¬ 
rocal  of  the  dual  slacks  as  indicators,  namely 


(A.7) 


I(yk)  =  (Yk)~2e, 


where  Y  =  diag(y).  This  indicator  satisfies 


if  i  G  Z(xm) 
if  i  $  Z(x*) 


They  used  the  identification  criterion 

Ii(yk)<  10"3q  =p  xk  =  0, 


where  e«j  is  the  geometric  mean  of  the  arithmetic  and  harmonic  means  of  the  compo¬ 
nents  of  the  vector  (l/y2, . . . ,  1  /y2)-  The  use  of  this  indicator  requires  strict  comple¬ 
mentarity.  Unfortunately,  this  indicator  satisfies  neither  the  sharp  nor  the  uniform 
separation  property. 

A.7.  The  Choi-Goldfarb  Indicators.  In  an  attempt  to  generalize  Todd-Ye 
indicators  to  algorithms  that  are  not  necessarily  interior-point,  methods,  Choi  and 
Goldfarb  [5]  proposed  two  indicators  using  ellipsoids  that  contain  optimal  solutions 
of  the  linear  programming  problems.  As  the  case  with  the  Todd-Ye  indicators  the 
Choi-Goldfarb  indicators  are  expensive  to  compute  at  each  iteration.  Also  if  all  zero 
variables  (i.e.  all  members  of  Z(x*)  are  to  be  identified,  then  the  assumption  of  primal 
non  degeneracy  is  required. 
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